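Federated learning (FL) trains a global model across many decentralized users, each holding a local dataset. Compared with traditional centralized learning, FL does not require direct access to local datasets and therefore aims to mitigate data privacy concerns. However, data privacy leakage persists in FL because of inference attacks, including membership inference, property inference, and data inversion. In this work, we propose a new type of privacy inference attack, coined the Preference Profiling Attack (PPA), which accurately profiles the private preferences of a local user, e.g., the most liked (disliked) items in a client's online shopping and the most common expressions in a user's selfies. In general, PPA can profile top-k preferences (k = 1, 2, 3, and k = 1 in particular) contingent on the characteristics of the local client (user). Our key insight is that the gradient variation of a local user's model is distinguishably sensitive to the sample proportion of a given class, especially the majority (minority) class. By observing a user model's gradient sensitivity to a class, PPA can profile the sample proportion of that class in the user's local dataset and thereby expose the user's preference for the class. The inherent statistical heterogeneity of FL further facilitates PPA. We extensively evaluate the effectiveness of PPA on four datasets (MNIST, CIFAR10, RAF-DB, and Products-10K). Our results show that PPA achieves 90% and 98% top-1 attack accuracy on MNIST and CIFAR10, respectively. More importantly, in real-world commercial scenarios of shopping (i.e., Products-10K) and social networks (i.e., RAF-DB), PPA achieves a top-1 attack accuracy of 78% in the former case when inferring the most ordered items (i.e., as a commercial competitor) and 88% in the latter case when inferring a victim user's most frequent facial expression, e.g., disgust.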
We present NeRFEditor, an efficient learning framework for 3D scene editing that takes a video captured over 360° as input and outputs a high-quality, identity-preserving stylized 3D scene. Our method supports diverse types of editing, such as editing guided by reference images, text prompts, and user interactions. We achieve this by encouraging a pre-trained StyleGAN model and a NeRF model to learn from each other. Specifically, we use a NeRF model to generate numerous image-angle pairs to train an adjustor, which can adjust the StyleGAN latent code to generate high-fidelity stylized images for any given angle. To extrapolate editing to GAN out-of-domain views, we devise another module that is trained in a self-supervised manner. This module maps novel-view images to the hidden space of StyleGAN, which allows StyleGAN to generate stylized images on novel views. Together, these two modules produce guided images in 360° views that are used to fine-tune a NeRF to achieve the stylization effects, and we propose a stable fine-tuning strategy for this step. Experiments show that NeRFEditor outperforms prior work on benchmark and real-world scenes with better editability, fidelity, and identity preservation.
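Recent works on image harmonization treat the problem as a pixel-wise image translation task solved with large autoencoders. Their performance is unsatisfactory and their inference is slow when dealing with high-resolution images. In this work, we observe that adjusting the input arguments of basic image filters, e.g., brightness and contrast, is sufficient for humans to produce realistic images from composite ones. Hence, we frame image harmonization as an image-level regression problem that learns the arguments of the filters humans use for the task. We present the Harmonizer framework for image harmonization. Unlike prior methods based on black-box autoencoders, Harmonizer contains a neural network for filter argument prediction and several white-box filters (driven by the predicted arguments) for image harmonization. We also introduce a cascade regressor and a dynamic loss strategy so that Harmonizer learns the filter arguments more stably. Since our network outputs only image-level arguments and the filters we use are efficient, Harmonizer is lighter and faster than existing methods. Comprehensive experiments demonstrate that Harmonizer surpasses existing methods, especially on high-resolution inputs. Finally, we apply Harmonizer to video harmonization, achieving results that are consistent across frames at 56 fps and 1080p resolution. Code and models are available at: https://github.com/zhkkke/harmonizer.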
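To make the white-box filter idea concrete, here is a minimal sketch; it is not the Harmonizer implementation. It applies two image-level arguments, brightness and contrast, to the foreground pixels of a composite image. The function names, the argument range of [-1, 1], and the filter formulas are illustrative assumptions. In the framework described above a neural network predicts the arguments, whereas here they are simply passed in by hand.

```python
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def clamp(v: float) -> float:
    """Keep a pixel value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, v))


def brightness_filter(pixel: float, arg: float) -> float:
    """Shift intensity; arg = 0 leaves the pixel unchanged (assumed form)."""
    return clamp(pixel + arg)


def contrast_filter(pixel: float, arg: float, pivot: float = 0.5) -> float:
    """Scale the distance from a mid-gray pivot; arg = 0 is identity (assumed form)."""
    return clamp(pivot + (pixel - pivot) * (1.0 + arg))


def harmonize(composite: Image, mask: Image, brightness: float, contrast: float) -> Image:
    """Filter the foreground (mask == 1) and leave the background (mask == 0) untouched."""
    out = []
    for img_row, mask_row in zip(composite, mask):
        row = []
        for p, m in zip(img_row, mask_row):
            filtered = contrast_filter(brightness_filter(p, brightness), contrast)
            row.append(m * filtered + (1.0 - m) * p)
        out.append(row)
    return out


composite = [[0.2, 0.8], [0.4, 0.6]]
mask = [[1.0, 0.0], [1.0, 0.0]]  # the left column is the pasted foreground
result = harmonize(composite, mask, brightness=0.1, contrast=0.0)
```

Because the two arguments are per-image scalars in this sketch, the same pair can be applied to an image of any resolution without changing the filter code.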
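Recently, the development of machine learning (ML) potentials has made it possible to perform large-scale, long-time molecular simulations with the accuracy of quantum mechanical (QM) models. However, for high-level QM methods, such as density functional theory (DFT) at the meta-GGA level and/or with exact exchange, quantum Monte Carlo, etc., generating a sufficient amount of training data remains computationally challenging because of their high cost. In this work, we demonstrate that this issue can be largely alleviated with Deep Kohn-Sham (DeePKS), an ML-based DFT model. DeePKS employs a computationally efficient neural-network-based functional model to construct a correction term added on top of a cheap DFT model. After training, DeePKS provides energies and forces that closely match those of high-level QM methods, while the amount of training data required is orders of magnitude smaller than that needed to train a reliable ML potential. DeePKS can therefore serve as a bridge between expensive QM models and ML potentials: one can generate a modest amount of high-accuracy QM data to train a DeePKS model, and then use the DeePKS model to label a much larger number of configurations for training an ML potential. This scheme for periodic systems is implemented in the DFT package ABACUS, which is open source and ready for use in various applications.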
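Monocular 3D object detection is very challenging in autonomous driving because of the lack of depth information. This paper proposes a monocular 3D object detection algorithm based on multi-scale depth stratification, which uses an anchor-free method to detect 3D objects in a per-pixel prediction. In the proposed MDS-Net, a novel depth-based stratification structure is developed to improve the network's depth prediction ability by establishing a mathematical model between an object's depth and its image size. A new angle loss function is then developed to further improve the accuracy of angle prediction and speed up the convergence of training. Finally, an optimized soft-NMS is applied in the post-processing stage to adjust the confidence of candidate boxes. Experiments on the KITTI benchmark show that MDS-Net outperforms existing monocular 3D detection methods on the 3D detection and BEV detection tasks while meeting real-time requirements.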
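The post-processing step that adjusts candidate-box confidence can be illustrated with the standard Gaussian Soft-NMS procedure. The sketch below is that textbook variant on 2D axis-aligned boxes, not the optimized version used in MDS-Net; the function names and the `sigma` and `score_threshold` defaults are assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(
    boxes: List[Box],
    scores: List[float],
    sigma: float = 0.5,
    score_threshold: float = 0.001,
) -> List[Tuple[Box, float]]:
    """Return (box, score) pairs in selection order, with overlapping boxes down-weighted."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the candidate with the highest current score.
        best_index = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_index)
        kept.append((best_box, best_score))
        # Decay the remaining scores by their overlap with the selected box
        # instead of discarding overlapping boxes outright.
        rescored = []
        for box, score in candidates:
            decayed = score * math.exp(-(iou(best_box, box) ** 2) / sigma)
            if decayed >= score_threshold:
                rescored.append((box, decayed))
        candidates = rescored
    return kept


boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0)]
scores = [0.9, 0.8, 0.7]
detections = soft_nms(boxes, scores)
```

In this toy input the second box overlaps the first heavily, so its confidence is reduced and it is ranked below the distant third box rather than being removed.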
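In this work, we present DCC (Deeper-yet-Compatible Compression), an enabling technique for real-time drone-sourced, edge-assisted video analytics built on top of existing codecs. DCC tackles an important technical problem: compressing video streamed from the drone to the edge without sacrificing the accuracy and timeliness of the video analytics tasks performed at the edge. DCC is inspired by the fact that not every bit in a streamed video is equally valuable to video analytics, which opens new room for compression beyond conventional analytics-oblivious codec technology. We exploit drone-specific context and intermediate hints from object detection to pursue the adaptive fidelity needed to retain analytical quality. We have prototyped DCC in a showcase vehicle detection application and validated its efficiency in representative scenarios. DCC reduces transmission volume by 9.5-fold over the baseline approach and by 19-683% over the state of the art, with comparable detection accuracy.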
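We propose and study a new task named Blind Image Decomposition (BID), which requires separating a superimposed image into its constituent underlying images in a blind setting, that is, both the source components involved in the mixing and the mixing mechanism are unknown. For example, rain may consist of multiple components, such as rain streaks, raindrops, snow, and haze. A rainy image can be treated as an arbitrary combination of these components, some of them or all of them. How to decompose superimposed images, such as rainy images, into distinct source components is a crucial step toward real-world vision systems. To facilitate research on this new task, we construct multiple benchmark datasets, including mixed image decomposition across multiple domains, real-scenario deraining, and joint shadow/reflection/watermark removal. Moreover, we propose a simple yet general Blind Image Decomposition Network (BIDeN) to serve as a strong baseline for future work. Experimental results demonstrate the effectiveness of our benchmarks and of BIDeN.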
Benefiting from its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
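The pull-together, push-apart objective can be sketched as follows. This is an illustrative loss, not the exact CCGC objective: the function names, the hinge applied to negative similarities, and the assumption that the inputs already contain only high-confidence samples with their cluster labels are all choices made for this example.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def cluster_centers(embeddings: List[Vector], labels: List[int]) -> Dict[int, Vector]:
    """Mean embedding of each cluster."""
    groups: Dict[int, List[Vector]] = {}
    for vec, label in zip(embeddings, labels):
        groups.setdefault(label, []).append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def contrastive_loss(view1: List[Vector], view2: List[Vector], labels: List[int]) -> float:
    """Reward same-cluster cross-view similarity, penalize similar centers of different clusters."""
    n = len(labels)
    # Positive pairs: one sample from each view, both in the same cluster.
    pos = [
        cosine(view1[i], view2[j])
        for i in range(n)
        for j in range(n)
        if labels[i] == labels[j]
    ]
    # Negative pairs: cross-view centers of different clusters.
    centers1 = cluster_centers(view1, labels)
    centers2 = cluster_centers(view2, labels)
    neg = [cosine(centers1[a], centers2[b]) for a in centers1 for b in centers2 if a != b]
    pos_term = sum(1.0 - s for s in pos) / len(pos)
    neg_term = sum(max(0.0, s) for s in neg) / len(neg) if neg else 0.0
    return pos_term + neg_term


view1 = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
view2 = [[0.95, 0.05], [1.0, 0.0], [0.05, 0.95], [0.0, 1.0]]
good_loss = contrastive_loss(view1, view2, [0, 0, 1, 1])   # labels match the geometry
mixed_loss = contrastive_loss(view1, view2, [0, 1, 0, 1])  # labels cut across it
```

With cluster labels that match the geometry of the embeddings the loss is small, while labels that cut across the two groups produce a much larger value, which is the behavior the objective is meant to reward.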
As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment return, as the coefficient for building an importance map. We also conduct experiments to investigate three major questions: whether frames are equally important, how effective the importance map is, and how importance maps from different IL models are connected. The results show that R2RISE successfully distinguishes the important frames in the demonstrations.
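The masking-and-retraining loop can be sketched in the style of RISE, where each frame's importance is the average return of the models trained on masks that kept that frame. This is a simplified illustration under stated assumptions, not the R2RISE implementation: `train_and_evaluate` is a hypothetical callable standing in for "retrain the IL model on the masked demonstrations and measure its environment return", and the toy version below replaces real training with a fixed formula.

```python
import random
from typing import Callable, List, Sequence


def importance_map(
    num_frames: int,
    train_and_evaluate: Callable[[Sequence[int]], float],
    num_masks: int = 200,
    keep_prob: float = 0.5,
    seed: int = 0,
) -> List[float]:
    """Average the return obtained under each random mask over the masks that kept a frame."""
    rng = random.Random(seed)
    weighted = [0.0] * num_frames
    counts = [0] * num_frames
    for _ in range(num_masks):
        # 1 keeps a frame in the demonstrations, 0 masks it out.
        mask = [1 if rng.random() < keep_prob else 0 for _ in range(num_frames)]
        ret = train_and_evaluate(mask)  # hypothetical: retrain, then roll out in the environment
        for t, kept in enumerate(mask):
            if kept:
                weighted[t] += ret
                counts[t] += 1
    return [w / c if c else 0.0 for w, c in zip(weighted, counts)]


def toy_train_and_evaluate(mask: Sequence[int]) -> float:
    """Toy stand-in: only frames 2 and 5 contribute to the final return."""
    return 10.0 * mask[2] + 5.0 * mask[5]


scores = importance_map(num_frames=8, train_and_evaluate=toy_train_and_evaluate, num_masks=2000)
```

In the toy setting the map ranks frame 2 highest and frame 5 above the remaining frames, because models "trained" without those frames obtain lower returns.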
Text clustering and topic extraction are two important tasks in text mining. Usually, these two tasks are performed separately. For topic extraction to facilitate clustering, we can first project texts into a topic space and then run a clustering algorithm to obtain clusters. To promote topic extraction by clustering, we can first obtain clusters with a clustering algorithm and then extract cluster-specific topics. However, this naive strategy ignores the fact that text clustering and topic extraction are strongly correlated and follow a chicken-and-egg relationship. Performing them separately prevents them from benefiting each other and achieving the best overall performance. In this paper, we propose an unsupervised text clustering and topic extraction framework (ClusTop) that integrates the two tasks into a unified framework, achieving high-quality clustering results and extracting topics from each cluster simultaneously. Our framework includes four components: enhanced language model training, dimensionality reduction, clustering, and topic extraction, where the enhanced language model can be viewed as a bridge between clustering and topic extraction. On the one hand, it provides text embeddings with a strong cluster structure, which facilitates effective text clustering; on the other hand, its self-attention architecture pays close attention to topic-related words, which benefits topic extraction. Moreover, the training of the enhanced language model is unsupervised. Experiments on two datasets demonstrate the effectiveness of our framework and provide benchmarks for different model combinations within it.